Phrase-Level Combination of SMT and TM Using Constrained Word Lattice
Authors
Abstract
Constrained translation has improved statistical machine translation (SMT) by combining it with translation memory (TM) at the sentence level. In this paper, we propose using a constrained word lattice, which encodes input phrases and TM constraints together, to combine SMT and TM at the phrase level. Experiments on English–Chinese and English–French show that our approach is significantly better than previous combination methods, including sentence-level constrained translation and a recent phrase-level combination.
Similar resources
Rich Linguistic Features for Translation Memory-Inspired Consistent Translation
We improve translation memory (TM)-inspired consistent phrase-based statistical machine translation (PB-SMT) using rich linguistic information including lexical, part-of-speech, dependency, and semantic role features to predict whether a TM-derived sub-segment should constrain PB-SMT translation. Besides better translation consistency, for English-to-Chinese Symantec TMs we report a 1.01 BLEU po...
Statistical Machine Translation without a Source-side Parallel Corpus Using Word Lattice and Phrase Extension
Statistical machine translation (SMT) requires a parallel corpus between the source and target languages. Although a pivot-translation approach can be applied to a language pair that does not have a parallel corpus directly between them, it requires both source–pivot and pivot–target parallel corpora. We propose a novel approach to apply SMT to a resource-limited source language that has no par...
Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension
Statistical machine translation (SMT) requires a parallel corpus between the source and target languages. Although a pivot-translation approach can be applied to a language pair that does not have a parallel corpus directly between them, it requires both source–pivot and pivot–target parallel corpora. We propose a novel approach to apply SMT to a resource-limited source language that has no par...
Syntax-aware Phrase-based Statistical Machine Translation: System Description
We present a variant of phrase-based SMT that uses source-side parsing and a constituent reordering model based on word alignments in the word-aligned training corpus to predict hierarchical block-wise reordering of the input. Multiple possible translation orders are represented compactly in a source order lattice. This source order lattice is then annotated with phrase-level translations to fo...
Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
In this paper, we propose a novel translation model (TM) based cross-lingual data selection model for language model (LM) adaptation in statistical machine translation (SMT), from word models to phrase models. Given a source sentence in the translation task, this model directly estimates the probability that a sentence in the target LM training corpus is similar. Compared with the traditional a...